-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[geometric]Move graph-related incubate api to geometric #44970
Conversation
6a84139
to
554485a
Compare
from paddle import _C_ops | ||
|
||
|
||
def graph_reindex(x, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里计划加一个 heter_graph_reindex,把同构图和异构图的 reindex 区分开来。heter_graph_reindex 就输入 List of Tensor,然后直接在 API 层直接concat 成 Tensor。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已增加
python/paddle/geometric/math.py
Outdated
out, tmp = _C_ops.segment_pool(data, segment_ids, 'pooltype', "SUM") | ||
return out | ||
|
||
check_variable_and_dtype(data, "X", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
segment 目前是否支持float16了?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
原子操作支持了就可以了,我增加一下。
reindex_src, reindex_dst, out_nodes = \ | ||
_C_ops.graph_reindex(x, neighbors, count, value_buffer, index_buffer, | ||
"flag_buffer_hashtable", flag_buffer_hashtable) | ||
return reindex_src, reindex_dst, out_nodes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里有个因为,目前不在区分新、老动态图的目的是什么了?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
因为 graph_reindex 这些操作可能没有反向,框架侧同学没有增加支持新动态图。
be int32, and should be filled with -1. | ||
index_buffer (Tensor|None): Index buffer for hashtable. The data type should | ||
be int32, and should be filled with -1. | ||
flag_buffer_hashtable (bool): Whether to use buffer for hashtable to speed up. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里的变量名字可以再想想,既然都是bool类型了,名字加上了一个flag比较奇怪,has_buffer_hastable ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已修改成has_buffer_hastable
check_variable_and_dtype(count, "Count", ("int32"), "graph_reindex") | ||
|
||
if flag_buffer_hashtable: | ||
check_variable_and_dtype(value_buffer, "HashTable_Value", ("int32"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里只是支持int32为啥了? 看代码逻辑,这里应该是需要支持int64
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这块我目前在 C++层面是只设置支持了 int32。因为担心如果一个图的节点数量过多,则申请的 hashtable 空间会很大,此时其实也不适合用这种方式加速了。不过后面可以找时间把这块的支持放上去。这次先暂时略过。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: 增加 int64支持。
from paddle import _C_ops | ||
|
||
|
||
def sample_neighbors(row, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里的API名字,是否要考虑到后续的API的接入了?这个名字看起来一种GraphSage的采样的方式,这里需要讨论一下
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
参考 dgl,把这个文件名改成了 neighbors.py
sample_size (int): The number of neighbors we need to sample. Default value is | ||
-1, which means returning all the neighbors of the input nodes. | ||
return_eids (bool): Whether to return eid information of sample edges. Default is False. | ||
flag_perm_buffer (bool): Using the permutation for fisher-yates sampling in GPU. Default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
同上
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已修改
c0b5157
to
d5ac560
Compare
has_perm_buffer=False, | ||
name=None): | ||
""" | ||
Graph Sample Neighbors API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里整体上是新增了API,文档先预览一下,粘贴到PR描述中。
from paddle.fluid.layer_helper import LayerHelper | ||
from paddle.fluid.framework import _non_static_mode | ||
from paddle.fluid.data_feeder import check_variable_and_dtype | ||
from paddle.fluid import core |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from paddle.fluid import core
用到了吗?
geometric 目录整体,可以用 pylint 扫一遍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
input_nodes, | ||
eids=None, | ||
perm_buffer=None, | ||
sample_size=-1, | ||
return_eids=False, | ||
has_perm_buffer=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里的 参数顺序,是如何考虑,perm_buffer
是一个较高使用频率的参数吗》
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
不是,但是不知道参数顺序有么有要求 Tensor 放前面,attribute 放后面?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
已调整参数顺序
'graph_reindex', | ||
'heter_graph_reindex', | ||
'khop_sampler', | ||
'sample_neighbors', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
感觉这里的命名,不是很对齐
67af714
to
e9c50fb
Compare
@@ -381,15 +379,14 @@ def send_uv(x, y, src_index, dst_index, message_op="add", name=None): | |||
src_index (Tensor): An 1-D tensor, and the available data type is int32, int64. | |||
dst_index (Tensor): An 1-D tensor, and should have the same shape as `src_index`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
send_ue_recv 的除法有潜在 除 0 错误。
部分api 注释中 code 部分,定义数据的重复代码比较多。
if message_op == 'sub':
message_op = 'add'
y = -y
if message_op == "div":
message_op = 'mul'
y = 1. / y
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
api 加到 __all__
中
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
应该不加, api调用不会走到 send_recv.xxx 这一级别的。所以泽阳让我限制__all__等于空。
def heter_graph_reindex(x, | ||
neighbors, | ||
count, | ||
value_buffer=None, | ||
index_buffer=None, | ||
has_buffer_hashtable=False, | ||
name=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def graph_reindex(x,
neighbors,
count,
value_buffer=None,
index_buffer=None,
has_buffer_hashtable=False,
name=None):
def heter_graph_reindex(x,
neighbors,
count,
value_buffer=None,
index_buffer=None,
has_buffer_hashtable=False,
name=None):
两个API统一成一个是否简洁
def reindex_graph(x,
neighbors,
count,
heter_graph=False,
value_buffer=None,
index_buffer=None,
has_buffer_hashtable=False,
name=None):
fd6ee6b
to
c8b50fb
Compare
perm_buffer (Tensor): Permutation buffer for fisher-yates sampling. If `has_perm_buffer` | ||
is True, then `perm_buffer` should not be None. The data type should | ||
be the same with `row`. Default is None. | ||
has_perm_buffer (bool): Using the permutation for fisher-yates sampling in GPU. Default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这里可以用None 来判断是否使用,perm_buffer
减少一个参数?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
PR types
New features
PR changes
APIs
Describe
Move graph-related API to paddle.geometric.